Figure 3.40(a) shows an unpruned CART tree constructed for the Iris data [Fisher, 1936]. In this tree, five branch nodes were employed based on [...] of four variables. Some variables were used repeatedly, such as Petal.Length. Importantly, the branch node Sepal.Length was a redundant node. The far right branch node Petal.Length was also a redundant branch node, because the leaf nodes of these two branch nodes all ended up with the same class. Therefore, this tree needs to be pruned. The rpart package was used to do this pruning for this data. Figure 3.40(b) shows a pruned tree. It can be seen that only two variables and two branch nodes were sufficient to describe the relationship between the feature variables and the species in this data set.

Figure 3.40. The CART models generated for the Iris data. (a) The unpruned tree. (b) The pruned tree.
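As a rough sketch of the grow-then-prune workflow described above, the following R code grows a large tree with rpart and then prunes it back. The control settings (cp = 0, minsplit = 2), the random seed, and the rule of choosing the complexity parameter with the lowest cross-validated error are assumptions made for illustration. They are not the settings used to produce Figure 3.40, so the node counts printed here may differ from those in the figure.

```r
library(rpart)  # recommended package shipped with standard R installations

set.seed(1)  # the cross-validation inside rpart is random; fix the seed

# Grow a deliberately large tree. cp = 0 and minsplit = 2 are assumed
# settings that suppress early stopping; they are not taken from the text.
full <- rpart(Species ~ ., data = iris, method = "class",
              control = rpart.control(cp = 0, minsplit = 2))

# Choose the complexity parameter with the lowest cross-validated error
# (one common rule of thumb, assumed here).
cp_table <- full$cptable
best_cp  <- cp_table[which.min(cp_table[, "xerror"]), "CP"]

# Cost-complexity pruning: remove splits that do not justify their complexity.
pruned <- prune(full, cp = best_cp)

# Count branch (non-leaf) nodes before and after pruning.
n_branch <- function(tree) sum(tree$frame$var != "<leaf>")
cat("branch nodes, full tree:  ", n_branch(full), "\n")
cat("branch nodes, pruned tree:", n_branch(pruned), "\n")
```

Calling print(pruned) afterwards lists each node with its splitting rule, the number of observations, and the vector of class proportions.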

The text format of a pruned tree for the Iris data is shown below. It shows that the root node (A) received the whole data set. This whole space was partitioned using the first partitioning rule Petal.Length<2.45. After this partition, one subspace was pure for the Setosa class. This is noted by a vector (1 0 0) for this partition, which indicates that there were no classes other than the Setosa class in this subspace. The other subspace was a mixture of two classes of flowers, as indicated by a vector (0 0.5 0.5). It means that half (50%) of this subspace was occupied by Versicolor flowers and the other half (50%) was occupied by Virginica flowers. This subspace was thus further partitioned by the